Principle Component Analysis and Partial Least Squares: Two Dimension Reduction Techniques for Regression

Authors

  • Saikat Maitra
  • Jun Yan
Abstract

Dimension reduction is one of the major tasks in multivariate analysis, and it is especially critical for multivariate regression in many P&C insurance-related applications. In this paper, we present two methodologies, principal component analysis (PCA) and partial least squares (PLS), for dimension reduction when the independent variables used in a regression are highly correlated. PCA, as a dimension reduction methodology, is applied without considering the correlation between the dependent variable and the independent variables, while PLS is applied based on that correlation. We therefore call PCA an unsupervised dimension reduction methodology and PLS a supervised one. We describe the PCA and PLS algorithms and compare their performance in multivariate regressions using simulated data.
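The unsupervised/supervised distinction drawn in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' code or their simulation design: the data-generating setup (two hypothetical latent factors `f1` and `f2`, with the response driven only by the lower-variance one) and all function names are assumptions chosen for illustration. The first PCA direction is computed from the predictors alone, while the first PLS direction is taken proportional to the covariance between each predictor and the response.

```python
import numpy as np

def first_pca_direction(X):
    """Unit weight vector of the first principal component (uses X only)."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return vt[0]

def first_pls_direction(X, y):
    """Unit weight vector of the first PLS component (uses X and y)."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    w = Xc.T @ yc  # proportional to the covariance of each predictor with y
    return w / np.linalg.norm(w)

def r_squared_one_component(X, y, w):
    """R^2 of a least-squares fit of y on the single score t = Xc @ w."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    t = Xc @ w
    beta = (t @ yc) / (t @ t)
    resid = yc - beta * t
    return 1.0 - (resid @ resid) / (yc @ yc)

# Simulated data (illustrative assumption, not the paper's design):
# four correlated predictors share a high-variance factor f1, while the
# response depends only on a lower-variance factor f2.
rng = np.random.default_rng(0)
n = 2000
f1 = 2.0 * rng.normal(size=n)
f2 = rng.normal(size=n)
signs = np.array([1.0, -1.0, 1.0, -1.0])
X = f1[:, None] + f2[:, None] * signs + 0.1 * rng.normal(size=(n, 4))
y = f2 + 0.2 * rng.normal(size=n)

r2_pca = r_squared_one_component(X, y, first_pca_direction(X))
r2_pls = r_squared_one_component(X, y, first_pls_direction(X, y))
print(f"R^2 with one PCA component: {r2_pca:.3f}")
print(f"R^2 with one PLS component: {r2_pls:.3f}")
```

In this constructed case the first principal component tracks the dominant factor, which carries almost no information about the response, so a single PLS component explains far more of `y`. The gap depends entirely on the simulated design and should not be read as a general ranking of the two methods.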

Similar resources

Local Dimensionality Reduction

If globally high dimensional data has locally only low dimensional distributions, it is advantageous to perform a local dimensionality reduction before further processing the data. In this paper we examine several techniques for local dimensionality reduction in the context of locally weighted linear regression. As possible candidates, we derive local versions of factor analysis regression, pri...

Boiling Points Predictions Study via Dimension Reduction Methods: SIR, PCR and PLSR

Variable selection is an important tool in QSAR. In this article, we employ three known techniques: sliced inverse regression (SIR), principal components regression (PCR) and partial least squares regression (PLSR) for models to predict the boiling points of 530 saturated hydrocarbons. With 122 topological indices as input variables our results show that these three methods have good performanc...

STA 4107/5107 Statistical Learning: Principle Components and Partial Least Squares Regression

Principal components analysis is traditionally presented as an interpretive multivariate technique, where the loadings are chosen to maximally explain the variance in the variable. However, we will consider it here mainly as a statistical learning tool, by using the derived components in a least squares regression to predict unobserved response variables using the principal components. Principa...

Dimension reduction for classification with gene expression microarray data.

An important application of gene expression microarray data is classification of biological samples or prediction of clinical and other outcomes. One necessary part of multivariate statistical analysis in such applications is dimension reduction. This paper provides a comparison study of three dimension reduction techniques, namely partial least squares (PLS), sliced inverse regression (SIR) an...

Projection Penalties: Dimension Reduction without Loss

Dimension reduction is popular for learning predictive models in high-dimensional spaces. It can highlight the relevant part of the feature space and avoid the curse of dimensionality. However, it can also be harmful because any reduction loses information. In this paper, we propose the projection penalty framework to make use of dimension reduction without losing valuable information. Reducing...

Publication date: 2008